Fix memory leak #73
base: master
Conversation
@j3soon Hi, this is an interesting issue. How many steps need to be run to reproduce the OOM? Does it actually occur in practice?
Great finding. Do you think the issue comes from the actions not being detached from the graph?
Yes. PyTorch maintains a computation graph during the forward pass to record the tensor operations. When the loss is defined and we call `loss.backward()` during training, that graph is released because the backward pass is actually performed. The graph attached to each stored action, however, is never released, since backward is never called on it.
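To make the retained-graph behaviour concrete, here is a minimal sketch; the tiny `nn.Linear` policy is just an illustrative stand-in, not code from this repo:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the policy network; not code from this repo.
policy = nn.Linear(4, 2)
state = torch.randn(1, 4)

action = policy(state)      # forward pass builds a computation graph
print(action.grad_fn)       # e.g. <AddmmBackward0 ...>: a graph is attached

# Storing `action` as-is (e.g. in a replay buffer) keeps that whole graph
# alive, because backward() is never called on it and nothing frees it.
detached = action.detach()  # same values, no graph attached
print(detached.grad_fn)     # None
```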
@j3soon Fantastic finding, I will try it.
`actions` should either be detached from the computation graph or converted into a numpy array before being stored in the replay buffer. In the original code, the entire computation graph for generating each action isn't released, which consumes an unnecessary amount of memory and may cause OOM if the program runs for a long time.
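A minimal sketch of the two options described above; `policy` and `replay_buffer` are illustrative stand-ins, not identifiers from this repository:

```python
import torch
import torch.nn as nn

# Illustrative stand-ins, not names from this repo.
policy = nn.Linear(4, 2)
replay_buffer = []

state = torch.randn(1, 4)

# Option 1: detach (or convert to numpy) before storing, so the buffer
# holds plain data with no autograd graph attached.
action = policy(state)
replay_buffer.append((state.numpy(), action.detach().cpu().numpy()))

# Option 2: generate the action under torch.no_grad(), so no graph is
# built for it in the first place.
with torch.no_grad():
    action = policy(state)
replay_buffer.append((state.numpy(), action.cpu().numpy()))
```

Either way, the replay buffer ends up holding plain data, so the graphs built during action selection can be freed immediately instead of accumulating over training.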